GradQuant: Introduction to Network Analysis in igraph

Network Analysis Introduction

What is a network?

A network graph depicts interconnections between nodes.

The presence or absence of each interconnection indicates whether there exists some form of connection between each pair of nodes.

Examples

  • Google Maps
  • Friendships between individuals
  • Flight routes between cities
  • Connections between brain regions.
  • Communications networks, telephone, satellite, cable and internet
  • Discrete words/phrases and their linguistic relations
  • Atomic or molecular structures

Install and Load Relevant Packages

# install.packages("igraph") # For network analysis
# install.packages("igraphdata") # For sample datasets
# install.packages("RColorBrewer") # For colors
library(igraph)
library(igraphdata)
library(RColorBrewer)
data("UKfaculty")

Structures

Below is a randomly created adjacency matrix. ‘1’ denotes an edge (i.e., connection), and ‘0’ denotes the absence of an edge

set.seed(52)
pilotAdj <- sapply(1:10, function(x) sample(c(0,1),10,replace=T))
diag(pilotAdj) <- 0 # No self-connections
print(pilotAdj)
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    0    0    1    0    1    0    0    1     0
 [2,]    1    0    1    0    1    0    1    1    0     0
 [3,]    0    0    0    0    1    1    0    1    1     0
 [4,]    0    1    0    0    0    0    0    0    0     1
 [5,]    1    0    1    0    0    0    0    1    1     0
 [6,]    0    1    0    0    1    0    1    0    1     0
 [7,]    0    1    1    0    1    0    0    1    0     0
 [8,]    1    1    1    1    0    1    0    0    0     0
 [9,]    0    1    0    1    1    1    1    0    0     1
[10,]    1    1    0    0    1    1    0    0    1     0

This may seem abstract so let’s add some labels to the nodes to make it more meaningful:

people <- c("Luke","Anne","Zee","Bob","Tim","Jazmin","Juan","Kalani","Liz","Brent")
colnames(pilotAdj) <- people
rownames(pilotAdj) <- people
pilotAdj
       Luke Anne Zee Bob Tim Jazmin Juan Kalani Liz Brent
Luke      0    0   0   1   0      1    0      0   1     0
Anne      1    0   1   0   1      0    1      1   0     0
Zee       0    0   0   0   1      1    0      1   1     0
Bob       0    1   0   0   0      0    0      0   0     1
Tim       1    0   1   0   0      0    0      1   1     0
Jazmin    0    1   0   0   1      0    1      0   1     0
Juan      0    1   1   0   1      0    0      1   0     0
Kalani    1    1   1   1   0      1    0      0   0     0
Liz       0    1   0   1   1      1    1      0   0     1
Brent     1    1   0   0   1      1    0      0   1     0

Adjacency Matrix

In the adjacency matrix representation, a graph is represented in the form of a two-dimensional array. The size of the array is V x V, where V is the set of vertices. The following image represents the adjacency matrix representation:

Converting an Adjacency Matrix to a Graph

An Adjacency list is an array consisting of the address of all the linked lists. The first node of the linked list represents the vertex and the remaining lists connected to this node represents the vertices to which this node is connected. This representation can also be used to represent a weighted graph. The linked list can slightly be changed to even store the weight of the edge.

pilotGraph <- graph_from_adjacency_matrix(pilotAdj)
pilotGraph
IGRAPH 397f698 DN-- 10 42 -- 
+ attr: name (v/c)
+ edges from 397f698 (vertex names):
 [1] Luke  ->Bob    Luke  ->Jazmin Luke  ->Liz    Anne  ->Luke   Anne  ->Zee   
 [6] Anne  ->Tim    Anne  ->Juan   Anne  ->Kalani Zee   ->Tim    Zee   ->Jazmin
[11] Zee   ->Kalani Zee   ->Liz    Bob   ->Anne   Bob   ->Brent  Tim   ->Luke  
[16] Tim   ->Zee    Tim   ->Kalani Tim   ->Liz    Jazmin->Anne   Jazmin->Tim   
[21] Jazmin->Juan   Jazmin->Liz    Juan  ->Anne   Juan  ->Zee    Juan  ->Tim   
[26] Juan  ->Kalani Kalani->Luke   Kalani->Anne   Kalani->Zee    Kalani->Bob   
[31] Kalani->Jazmin Liz   ->Anne   Liz   ->Bob    Liz   ->Tim    Liz   ->Jazmin
[36] Liz   ->Juan   Liz   ->Brent  Brent ->Luke   Brent ->Anne   Brent ->Tim   
+ ... omitted several edges

Converting an igraph to an edgelist

An edge list is a list or array of all the edges in a graph. Edge lists are one of the easier representations of a graph. In this implementation, the underlying data structure for keeping track of all the nodes and edges is a single list of pairs. Each pair represents a single edge and is comprised of the two unique IDs of the nodes involved. Each line/edge in the graph gets an entry in the edge list, and that single data structure then encodes all nodes and relationships.

pilotEList <- as_edgelist(pilotGraph)
pilotEList
      [,1]     [,2]    
 [1,] "Luke"   "Bob"   
 [2,] "Luke"   "Jazmin"
 [3,] "Luke"   "Liz"   
 [4,] "Anne"   "Luke"  
 [5,] "Anne"   "Zee"   
 [6,] "Anne"   "Tim"   
 [7,] "Anne"   "Juan"  
 [8,] "Anne"   "Kalani"
 [9,] "Zee"    "Tim"   
[10,] "Zee"    "Jazmin"
[11,] "Zee"    "Kalani"
[12,] "Zee"    "Liz"   
[13,] "Bob"    "Anne"  
[14,] "Bob"    "Brent" 
[15,] "Tim"    "Luke"  
[16,] "Tim"    "Zee"   
[17,] "Tim"    "Kalani"
[18,] "Tim"    "Liz"   
[19,] "Jazmin" "Anne"  
[20,] "Jazmin" "Tim"   
[21,] "Jazmin" "Juan"  
[22,] "Jazmin" "Liz"   
[23,] "Juan"   "Anne"  
[24,] "Juan"   "Zee"   
[25,] "Juan"   "Tim"   
[26,] "Juan"   "Kalani"
[27,] "Kalani" "Luke"  
[28,] "Kalani" "Anne"  
[29,] "Kalani" "Zee"   
[30,] "Kalani" "Bob"   
[31,] "Kalani" "Jazmin"
[32,] "Liz"    "Anne"  
[33,] "Liz"    "Bob"   
[34,] "Liz"    "Tim"   
[35,] "Liz"    "Jazmin"
[36,] "Liz"    "Juan"  
[37,] "Liz"    "Brent" 
[38,] "Brent"  "Luke"  
[39,] "Brent"  "Anne"  
[40,] "Brent"  "Tim"   
[41,] "Brent"  "Jazmin"
[42,] "Brent"  "Liz"   

Adjacency List

In the adjacency list representation, a graph is represented as an array of linked list. The index of the array represents a vertex and each element in its linked list represents the  vertices that form an edge with the vertex. The following image represents the adjacency list representation: 

Converting to adjacency list

as_adj_list(pilotGraph)
$Luke
+ 7/10 vertices, named, from 397f698:
[1] Anne   Bob    Tim    Jazmin Kalani Liz    Brent 

$Anne
+ 11/10 vertices, named, from 397f698:
 [1] Luke   Zee    Bob    Tim    Jazmin Juan   Juan   Kalani Kalani Liz   
[11] Brent 

$Zee
+ 8/10 vertices, named, from 397f698:
[1] Anne   Tim    Tim    Jazmin Juan   Kalani Kalani Liz   

$Bob
+ 5/10 vertices, named, from 397f698:
[1] Luke   Anne   Kalani Liz    Brent 

$Tim
+ 10/10 vertices, named, from 397f698:
 [1] Luke   Anne   Zee    Zee    Jazmin Juan   Kalani Liz    Liz    Brent 

$Jazmin
+ 9/10 vertices, named, from 397f698:
[1] Luke   Anne   Zee    Tim    Juan   Kalani Liz    Liz    Brent 

$Juan
+ 7/10 vertices, named, from 397f698:
[1] Anne   Anne   Zee    Tim    Jazmin Kalani Liz   

$Kalani
+ 9/10 vertices, named, from 397f698:
[1] Luke   Anne   Anne   Zee    Zee    Bob    Tim    Jazmin Juan  

$Liz
+ 11/10 vertices, named, from 397f698:
 [1] Luke   Anne   Zee    Bob    Tim    Tim    Jazmin Jazmin Juan   Brent 
[11] Brent 

$Brent
+ 7/10 vertices, named, from 397f698:
[1] Luke   Anne   Bob    Tim    Jazmin Liz    Liz   

Converting igraph back to an adjacency matrix

Notably, you might see that an adjacency matrix is a “sparse matrix” and is distinct from the base R class of “matrix”. To convert it, you will need to go a step further.

pilotAdj <- as_adjacency_matrix(pilotGraph)
pilotAdj
10 x 10 sparse Matrix of class "dgCMatrix"
                          
Luke   . . . 1 . 1 . . 1 .
Anne   1 . 1 . 1 . 1 1 . .
Zee    . . . . 1 1 . 1 1 .
Bob    . 1 . . . . . . . 1
Tim    1 . 1 . . . . 1 1 .
Jazmin . 1 . . 1 . 1 . 1 .
Juan   . 1 1 . 1 . . 1 . .
Kalani 1 1 1 1 . 1 . . . .
Liz    . 1 . 1 1 1 1 . . 1
Brent  1 1 . . 1 1 . . 1 .

Converting sparse matrix to base matrix

pilotBaseM <- as.matrix(pilotAdj)
pilotBaseM
       Luke Anne Zee Bob Tim Jazmin Juan Kalani Liz Brent
Luke      0    0   0   1   0      1    0      0   1     0
Anne      1    0   1   0   1      0    1      1   0     0
Zee       0    0   0   0   1      1    0      1   1     0
Bob       0    1   0   0   0      0    0      0   0     1
Tim       1    0   1   0   0      0    0      1   1     0
Jazmin    0    1   0   0   1      0    1      0   1     0
Juan      0    1   1   0   1      0    0      1   0     0
Kalani    1    1   1   1   0      1    0      0   0     0
Liz       0    1   0   1   1      1    1      0   0     1
Brent     1    1   0   0   1      1    0      0   1     0

Simple Inspection

Show Vertices

V(pilotGraph)
+ 10/10 vertices, named, from 397f698:
 [1] Luke   Anne   Zee    Bob    Tim    Jazmin Juan   Kalani Liz    Brent 

Show Edges

E(pilotGraph)
+ 42/42 edges from 397f698 (vertex names):
 [1] Luke  ->Bob    Luke  ->Jazmin Luke  ->Liz    Anne  ->Luke   Anne  ->Zee   
 [6] Anne  ->Tim    Anne  ->Juan   Anne  ->Kalani Zee   ->Tim    Zee   ->Jazmin
[11] Zee   ->Kalani Zee   ->Liz    Bob   ->Anne   Bob   ->Brent  Tim   ->Luke  
[16] Tim   ->Zee    Tim   ->Kalani Tim   ->Liz    Jazmin->Anne   Jazmin->Tim   
[21] Jazmin->Juan   Jazmin->Liz    Juan  ->Anne   Juan  ->Zee    Juan  ->Tim   
[26] Juan  ->Kalani Kalani->Luke   Kalani->Anne   Kalani->Zee    Kalani->Bob   
[31] Kalani->Jazmin Liz   ->Anne   Liz   ->Bob    Liz   ->Tim    Liz   ->Jazmin
[36] Liz   ->Juan   Liz   ->Brent  Brent ->Luke   Brent ->Anne   Brent ->Tim   
[41] Brent ->Jazmin Brent ->Liz   

Total Vertices

gorder(pilotGraph)
[1] 10

Total Edges

gsize(pilotGraph)
[1] 42

Basic Plot

plot(pilotGraph)

What is a Directed/Undirected Graph

  • Directed Graph: Directed graphs are a class of graphs that don’t presume symmetry or reciprocity in the edges established between vertices. In a directed graph, if a and b are two vertices connected by an edge (a,b), this doesn’t necessarily mean that an edge connecting (b,a) also exists

  • Undirected Graph: Symmetric; Undirected graphs are more specific. For them, there’s an extra assumption regarding the reciprocity in the relationship between pairs of vertices connected by an edge. If an edge (a,b) exists between two vertices a and b, the edge (b,a) also exists

Examples

Examples of directed graphs include parents and their offspring; Twitter

Examples of undirected graphs include pedestrian pathways, where movement between any two intersections of paths is possible in both directions; railways; Facebook; correlation matrices

Depiction of Directed Graph

Our prior adjacency matrix we generated is a directed graph.

print(pilotAdj)
10 x 10 sparse Matrix of class "dgCMatrix"
                          
Luke   . . . 1 . 1 . . 1 .
Anne   1 . 1 . 1 . 1 1 . .
Zee    . . . . 1 1 . 1 1 .
Bob    . 1 . . . . . . . 1
Tim    1 . 1 . . . . 1 1 .
Jazmin . 1 . . 1 . 1 . 1 .
Juan   . 1 1 . 1 . . 1 . .
Kalani 1 1 1 1 . 1 . . . .
Liz    . 1 . 1 1 1 1 . . 1
Brent  1 1 . . 1 1 . . 1 .

Depiction of Undirected Graph

set.seed(48)
pilotAdjUn <- sapply(1:10, function(x) sample(c(0,1),10,replace=T))
pilotAdjUn[lower.tri(pilotAdjUn)] = t(pilotAdjUn)[lower.tri(pilotAdjUn)]
diag(pilotAdjUn) <- 0
pilotAdjUn
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    0    0    1    1    0    1    1    0     0
 [2,]    0    0    1    1    0    0    1    0    0     0
 [3,]    0    1    0    1    1    0    1    1    0     1
 [4,]    1    1    1    0    0    0    1    1    1     0
 [5,]    1    0    1    0    0    0    1    0    0     0
 [6,]    0    0    0    0    0    0    1    0    0     0
 [7,]    1    1    1    1    1    1    0    1    1     1
 [8,]    1    0    1    1    0    0    1    0    1     0
 [9,]    0    0    0    1    0    0    1    1    0     0
[10,]    0    0    1    0    0    0    1    0    0     0
pilotGraphUn <- graph_from_adjacency_matrix(pilotAdjUn, mode="undirected")
is.directed(pilotGraphUn)
[1] FALSE

Weighted/unweighted graphs

Unweighted Graphs

If we care only if two nodes are connected or not, we call such a graph unweighted. For the nodes with an edge between them, we say they are adjacent or neighbors of one another.

Weighted Graphs

In many applications, the edges have numerical properties that we need to exploit in our algorithms to solve the problem at hand. For example, we must consider road lengths and traffic density while looking for the shortest path between two cities. So, we associate each edge e with a real value w(e) that we call its weight. We call such graphs weighted.

Example

pilotAdjW <- pilotAdj
pilotAdjW[pilotAdjW==1] <- rnorm(length(pilotAdj[pilotAdj==1]))
pilotAdjW
10 x 10 sparse Matrix of class "dgCMatrix"
                                                                             
Luke    .          .             .          -1.0800526  .           1.8262535
Anne   -0.3033929  .             0.04874371  .         -0.89683593  .        
Zee     .          .             .           .         -0.41426188 -1.3985065
Bob     .         -0.5514779390  .           .          .           .        
Tim    -0.5062032  .            -0.43876859  .          .           .        
Jazmin  .          0.0003886532  .           .         -0.35249792  .        
Juan    .         -0.9084064952 -0.16432731  .         -0.63373542  .        
Kalani -0.6270644  0.7297462996  1.98264432  0.3165565  .          -1.1625891
Liz     .          1.3719967855  .          -1.0852256 -0.07191939 -0.8335206
Brent   1.1605596 -0.2095287169  .           .         -0.70486068  0.1487730
                                                  
Luke    .          .         -0.36631444 .        
Anne    1.2561356  0.9715358  .          .        
Zee     .          0.9341368 -0.07047117 .        
Bob     .          .          .          0.6058275
Tim     .          0.2990072 -1.26668638 .        
Jazmin -0.9197363  .          0.88531943 .        
Juan    .         -0.7463273  .          .        
Kalani  .          .          .          .        
Liz    -0.5573352  .          .          1.5574155
Brent   .          .          0.12432662 .        
pilotGraphW <- graph.adjacency(pilotAdjW, weighted = T)
is.weighted(pilotGraphW)
[1] TRUE

Attributes

Vertex Attributes

partyAtts <- sample(c("Dem","Rep"), gorder(pilotGraph) ,replace = T) # Random attributes
partyAtts # Inspect
 [1] "Dem" "Dem" "Rep" "Dem" "Rep" "Dem" "Rep" "Dem" "Dem" "Rep"
pilotGraph <-set_vertex_attr(pilotGraph, "party", value = partyAtts) # Assign attributes

Check Vertex Attributes

vertex_attr(pilotGraph) # Check vertex attributes
$name
 [1] "Luke"   "Anne"   "Zee"    "Bob"    "Tim"    "Jazmin" "Juan"   "Kalani"
 [9] "Liz"    "Brent" 

$party
 [1] "Dem" "Dem" "Rep" "Dem" "Rep" "Dem" "Rep" "Dem" "Dem" "Rep"
V(pilotGraph)$party # or check specific vertex attribute
 [1] "Dem" "Dem" "Rep" "Dem" "Rep" "Dem" "Rep" "Dem" "Dem" "Rep"

Edge Attributes

friendshipAtts <- rnorm(gsize(pilotGraph))
friendshipAtts
 [1] -0.32649141  0.41635369  0.01981900 -0.65516483  0.61464925  0.91149901
 [7] -0.39453189 -0.22837836 -0.68597361  2.33719057  0.43163957  0.58103320
[13]  1.88668458 -0.70263564  2.13005539 -0.18521698  1.30723730 -0.10481191
[19] -1.12511912  0.90967468  2.67016095  0.24883337 -1.94102816 -0.98296103
[25] -1.06348692  0.87797022 -0.27141087  0.47532044  0.86695653 -0.06351474
[31] -1.29585987 -0.56441804 -0.09103845 -0.69537905 -1.02493045  2.06080413
[37] -0.20601530  0.10293105  0.44809381  1.08050718  1.00996288  0.89496548
pilotGraph <-set_edge_attr(pilotGraph, "friendship", value = friendshipAtts) # Assign attributes

Check Attributes

edge_attr(pilotGraph) # Check vertex attributes
$friendship
 [1] -0.32649141  0.41635369  0.01981900 -0.65516483  0.61464925  0.91149901
 [7] -0.39453189 -0.22837836 -0.68597361  2.33719057  0.43163957  0.58103320
[13]  1.88668458 -0.70263564  2.13005539 -0.18521698  1.30723730 -0.10481191
[19] -1.12511912  0.90967468  2.67016095  0.24883337 -1.94102816 -0.98296103
[25] -1.06348692  0.87797022 -0.27141087  0.47532044  0.86695653 -0.06351474
[31] -1.29585987 -0.56441804 -0.09103845 -0.69537905 -1.02493045  2.06080413
[37] -0.20601530  0.10293105  0.44809381  1.08050718  1.00996288  0.89496548
E(pilotGraph)$friendship # Check specific edge attributes
 [1] -0.32649141  0.41635369  0.01981900 -0.65516483  0.61464925  0.91149901
 [7] -0.39453189 -0.22837836 -0.68597361  2.33719057  0.43163957  0.58103320
[13]  1.88668458 -0.70263564  2.13005539 -0.18521698  1.30723730 -0.10481191
[19] -1.12511912  0.90967468  2.67016095  0.24883337 -1.94102816 -0.98296103
[25] -1.06348692  0.87797022 -0.27141087  0.47532044  0.86695653 -0.06351474
[31] -1.29585987 -0.56441804 -0.09103845 -0.69537905 -1.02493045  2.06080413
[37] -0.20601530  0.10293105  0.44809381  1.08050718  1.00996288  0.89496548

Subsetting Graph

We may want to subset all nodes that are “Dem”:

V(pilotGraph)[party=="Dem"]
+ 6/10 vertices, named, from 397f698:
[1] Luke   Anne   Bob    Jazmin Kalani Liz   

We may want to create a subsettted graph of only “Dem”:

induced_subgraph(pilotGraph, vids=which(V(pilotGraph)$party=="Dem") )
IGRAPH a6f554a DN-- 6 15 -- 
+ attr: name (v/c), party (v/c), friendship (e/n)
+ edges from a6f554a (vertex names):
 [1] Luke  ->Bob    Luke  ->Jazmin Luke  ->Liz    Anne  ->Luke   Anne  ->Kalani
 [6] Bob   ->Anne   Jazmin->Anne   Jazmin->Liz    Kalani->Luke   Kalani->Anne  
[11] Kalani->Bob    Kalani->Jazmin Liz   ->Anne   Liz   ->Bob    Liz   ->Jazmin

We may want to subset all of the edges that include “Anne”:

E(pilotGraph)[[inc("Anne")]]
+ 11/42 edges from 397f698 (vertex names):
     tail   head tid hid friendship
4    Anne   Luke   2   1 -0.6551648
5    Anne    Zee   2   3  0.6146492
6    Anne    Tim   2   5  0.9114990
7    Anne   Juan   2   7 -0.3945319
8    Anne Kalani   2   8 -0.2283784
13    Bob   Anne   4   2  1.8866846
19 Jazmin   Anne   6   2 -1.1251191
23   Juan   Anne   7   2 -1.9410282
28 Kalani   Anne   8   2  0.4753204
32    Liz   Anne   9   2 -0.5644180
39  Brent   Anne  10   2  0.4480938

We may want to subset all edges with a friendship greater than 0:

E(pilotGraph)[[friendship>0]]
+ 22/42 edges from 397f698 (vertex names):
     tail   head tid hid friendship
2    Luke Jazmin   1   6  0.4163537
3    Luke    Liz   1   9  0.0198190
5    Anne    Zee   2   3  0.6146492
6    Anne    Tim   2   5  0.9114990
10    Zee Jazmin   3   6  2.3371906
11    Zee Kalani   3   8  0.4316396
12    Zee    Liz   3   9  0.5810332
13    Bob   Anne   4   2  1.8866846
15    Tim   Luke   5   1  2.1300554
17    Tim Kalani   5   8  1.3072373
20 Jazmin    Tim   6   5  0.9096747
21 Jazmin   Juan   6   7  2.6701609
22 Jazmin    Liz   6   9  0.2488334
26   Juan Kalani   7   8  0.8779702
28 Kalani   Anne   8   2  0.4753204
29 Kalani    Zee   8   3  0.8669565
36    Liz   Juan   9   7  2.0608041
38  Brent   Luke  10   1  0.1029310
39  Brent   Anne  10   2  0.4480938
40  Brent    Tim  10   5  1.0805072
41  Brent Jazmin  10   6  1.0099629
42  Brent    Liz  10   9  0.8949655

Incident

Any edges to or from a given vertex... Here, for “Anne”

incident(pilotGraph, "Anne", mode=c("all"))
+ 11/42 edges from 397f698 (vertex names):
 [1] Anne  ->Luke   Anne  ->Zee    Bob   ->Anne   Anne  ->Tim    Jazmin->Anne  
 [6] Anne  ->Juan   Juan  ->Anne   Anne  ->Kalani Kalani->Anne   Liz   ->Anne  
[11] Brent ->Anne  

Neighbors

We can subset any nodes that are neighboring a node.

neighbors(pilotGraph, "Anne", mode=c("all"))
+ 11/10 vertices, named, from 397f698:
 [1] Luke   Zee    Bob    Tim    Jazmin Juan   Juan   Kalani Kalani Liz   
[11] Brent 

What friends do Anne and Liz share in common?

We can use intersection to discover this.

afriends <- neighbors(pilotGraph, "Anne", mode=c("out"))
lfriends <- neighbors(pilotGraph, "Anne", mode=c("out"))
intersection(afriends, lfriends)
+ 5/10 vertices, named, from 397f698:
[1] Luke   Zee    Tim    Juan   Kalani

Nodes in N steps…

ego(pilotGraph, order=1, mindist = 0, nodes = "Bob") # 1 step away
[[1]]
+ 6/10 vertices, named, from 397f698:
[1] Bob    Luke   Anne   Kalani Liz    Brent 
ego(pilotGraph, order=1, mindist = 1, nodes = "Bob") # 1 step away, not including Bob which is identical to neighbors() output
[[1]]
+ 5/10 vertices, named, from 397f698:
[1] Luke   Anne   Kalani Liz    Brent 
ego(pilotGraph, order=2, mindist = 0, nodes = "Bob") # 2 steps away, not including Bob
[[1]]
+ 10/10 vertices, named, from 397f698:
 [1] Bob    Luke   Anne   Kalani Liz    Brent  Tim    Jazmin Zee    Juan  

Relationships and Size

What are the two farthest apart nodes?

The distance between these two nodes reflect the diameter of the network?

farthest_vertices(pilotGraph)
$vertices
+ 2/10 vertices, named, from 397f698:
[1] Luke Zee 

$distance
[1] 3

What is the diameter?

As you can see, the diameter returns the same distance as is observed in farthest vertices.

diameter(pilotGraph)
[1] 3

Distance of all nodes

You can extract a table of all distances among all nodes as well.

distances(pilotGraph)
       Luke Anne Zee Bob Tim Jazmin Juan Kalani Liz Brent
Luke      0    1   2   1   1      1    2      1   1     1
Anne      1    0   1   1   1      1    1      1   1     1
Zee       2    1   0   2   1      1    1      1   1     2
Bob       1    1   2   0   2      2    2      1   1     1
Tim       1    1   1   2   0      1    1      1   1     1
Jazmin    1    1   1   2   1      0    1      1   1     1
Juan      2    1   1   2   1      1    0      1   1     2
Kalani    1    1   1   1   1      1    1      0   2     2
Liz       1    1   1   1   1      1    1      2   0     1
Brent     1    1   2   1   1      1    2      2   1     0

Similarity

The vertex similarity coefficient of a pair of vertices is a measure of how similar are the immediate networks of those two vertices. Imagine we have two vertices V1 and V2 in an unweighted graph G. There are three common ways to calculate the vertex similarity of V1 and V2.

Jaccard Similarity

The Jaccard similarity coefficient is the number of vertices who are neighbors of both V1 and V2 divided by the number of vertices who are neighbors of at least one of V1 and V2.

jmat <- similarity.jaccard(pilotGraph)
jmat
       [,1] [,2]      [,3]  [,4]      [,5]      [,6]      [,7]      [,8]
 [1,] 1.000  0.6 0.6250000 0.500 0.5000000 0.5000000 0.6250000 0.4000000
 [2,] 0.600  1.0 0.5000000 0.400 0.7000000 0.7000000 0.5000000 0.6000000
 [3,] 0.625  0.5 1.0000000 0.375 0.5555556 0.5555556 0.7142857 0.4444444
 [4,] 0.500  0.4 0.3750000 1.000 0.6250000 0.6250000 0.3750000 0.2000000
 [5,] 0.500  0.7 0.5555556 0.625 1.0000000 0.7777778 0.5555556 0.5000000
 [6,] 0.500  0.7 0.5555556 0.625 0.7777778 1.0000000 0.5555556 0.5000000
 [7,] 0.625  0.5 0.7142857 0.375 0.5555556 0.5555556 1.0000000 0.4444444
 [8,] 0.400  0.6 0.4444444 0.200 0.5000000 0.5000000 0.4444444 1.0000000
 [9,] 0.500  0.7 0.4000000 0.300 0.6000000 0.6000000 0.4000000 0.8750000
[10,] 0.625  0.5 0.5000000 0.375 0.4000000 0.4000000 0.5000000 0.6250000
           [,9]     [,10]
 [1,] 0.5000000 0.6250000
 [2,] 0.7000000 0.5000000
 [3,] 0.4000000 0.5000000
 [4,] 0.3000000 0.3750000
 [5,] 0.6000000 0.4000000
 [6,] 0.6000000 0.4000000
 [7,] 0.4000000 0.5000000
 [8,] 0.8750000 0.6250000
 [9,] 1.0000000 0.5555556
[10,] 0.5555556 1.0000000

Dice Similarity

The dice similarity coefficient is twice the number of vertices who are neighbors of both v1�1 and v2�2 divided by the sum of the degree centralities of v1�1 and v2�2. Thus, common neighbors are double counted in this method.

dmat<-similarity.dice(pilotGraph)
dmat
           [,1]      [,2]      [,3]      [,4]      [,5]      [,6]      [,7]
 [1,] 1.0000000 0.7500000 0.7692308 0.6666667 0.6666667 0.6666667 0.7692308
 [2,] 0.7500000 1.0000000 0.6666667 0.5714286 0.8235294 0.8235294 0.6666667
 [3,] 0.7692308 0.6666667 1.0000000 0.5454545 0.7142857 0.7142857 0.8333333
 [4,] 0.6666667 0.5714286 0.5454545 1.0000000 0.7692308 0.7692308 0.5454545
 [5,] 0.6666667 0.8235294 0.7142857 0.7692308 1.0000000 0.8750000 0.7142857
 [6,] 0.6666667 0.8235294 0.7142857 0.7692308 0.8750000 1.0000000 0.7142857
 [7,] 0.7692308 0.6666667 0.8333333 0.5454545 0.7142857 0.7142857 1.0000000
 [8,] 0.5714286 0.7500000 0.6153846 0.3333333 0.6666667 0.6666667 0.6153846
 [9,] 0.6666667 0.8235294 0.5714286 0.4615385 0.7500000 0.7500000 0.5714286
[10,] 0.7692308 0.6666667 0.6666667 0.5454545 0.5714286 0.5714286 0.6666667
           [,8]      [,9]     [,10]
 [1,] 0.5714286 0.6666667 0.7692308
 [2,] 0.7500000 0.8235294 0.6666667
 [3,] 0.6153846 0.5714286 0.6666667
 [4,] 0.3333333 0.4615385 0.5454545
 [5,] 0.6666667 0.7500000 0.5714286
 [6,] 0.6666667 0.7500000 0.5714286
 [7,] 0.6153846 0.5714286 0.6666667
 [8,] 1.0000000 0.9333333 0.7692308
 [9,] 0.9333333 1.0000000 0.7142857
[10,] 0.7692308 0.7142857 1.0000000

Inverse Log Weighted Similarity

The inverse log-weighted similarity coefficient is the sum of the inverse logs of the degrees of the common neighbors of V1 and V2. This definition asserts that common neighbors that have high degree in the network are ‘less valuable’ in detecting similarity because there is a higher likelihood that they would be a common neighbor simply by chance.

ilwmat<-similarity.invlogweighted(pilotGraph)
ilwmat
          [,1]     [,2]     [,3]     [,4]     [,5]      [,6]      [,7]     [,8]
 [1,] 0.000000 3.351919 3.068013 1.803083 2.675235 2.6544096 2.5956309 2.344814
 [2,] 3.351919 1.938036 4.589016 2.355068 5.216814 4.7150902 2.6975841 4.014241
 [3,] 3.068013 4.589016 1.778828 1.744304 3.130354 3.5438237 3.4850450 2.671672
 [4,] 1.803083 2.355068 1.744304 0.000000 2.734013 2.7340135 1.7062168 1.347963
 [5,] 2.675235 5.216814 3.130354 2.734013 1.795861 5.0437733 3.5401655 4.240574
 [6,] 2.654410 4.715090 3.543824 2.734013 5.043773 0.8340648 3.0384420 3.257953
 [7,] 2.595631 2.697584 3.485045 1.706217 3.540165 3.0384420 0.8340648 3.519340
 [8,] 2.344814 4.014241 2.671672 1.347963 4.240574 3.2579526 3.5193404 1.795861
 [9,] 3.844992 5.450553 3.578348 1.958727 4.344662 3.8221131 3.0937913 5.223821
[10,] 2.761846 2.858712 2.574806 1.764996 3.054180 3.0333548 2.5575437 2.858712
          [,9]     [,10]
 [1,] 3.844992 2.7618462
 [2,] 5.450553 2.8587122
 [3,] 3.578348 2.5748058
 [4,] 1.958727 1.7649955
 [5,] 4.344662 3.0541799
 [6,] 3.822113 3.0333548
 [7,] 3.093791 2.5575437
 [8,] 5.223821 2.8587122
 [9,] 2.806625 3.3310939
[10,] 3.331094 0.8340648
cor(c(jmat), c(dmat))
[1] 0.9875015
cor(c(jmat), c(ilwmat))
[1] -0.007228666
cor(c(dmat), c(ilwmat))
[1] 0.08668629

Important and Influential Nodes: Centrality

Degree Centrality

Degree centrality assigns an importance score based simply on the number of links held by each node.

Simple, straightforward measure of overall connections. Can distinguish between out and in-degree centrality in directed graph.

degree(pilotGraph)
  Luke   Anne    Zee    Bob    Tim Jazmin   Juan Kalani    Liz  Brent 
     7     11      8      5     10      9      7      9     11      7 
degree(pilotGraph, mode="out")
  Luke   Anne    Zee    Bob    Tim Jazmin   Juan Kalani    Liz  Brent 
     3      5      4      2      4      4      4      5      6      5 
degree(pilotGraph, mode="in")
  Luke   Anne    Zee    Bob    Tim Jazmin   Juan Kalani    Liz  Brent 
     4      6      4      3      6      5      3      4      5      2 

Closeness Centrality

The closeness centrality of a vertex v in a connected graph is the inverse of the sum of the distances from v to all other vertices. Closeness centrality is a measure of how efficiently the entire graph can be traversed from a given vertex.

closeness(pilotGraph)
      Luke       Anne        Zee        Bob        Tim     Jazmin       Juan 
0.05882353 0.07142857 0.07142857 0.06250000 0.07142857 0.07142857 0.06666667 
    Kalani        Liz      Brent 
0.07692308 0.08333333 0.07692308 

Betweenness Centrality

The betweenness centrality of a vertex V is calculated by taking each pair of other vertices X and Y, calculating the number of shortest paths between X and Y that go through V, dividing by the total number of shortest paths between X and Y, then summing over all such pairs of vertices in the graph. Betweenness centrality is a measure of how important a given vertex is in connecting other pairs of vertices in the graph… how frequently the vertex lies on shortest paths between any two vertices in the network.

betweenness(pilotGraph)
     Luke      Anne       Zee       Bob       Tim    Jazmin      Juan    Kalani 
 2.983333  9.523810  3.116667  2.852381  5.938095  4.523810  1.904762  5.616667 
      Liz     Brent 
12.207143  3.333333 

Eigenvector Centrality

The Eigenvector centrality or relative centrality or prestige of a vertex is a measure of how connected the vertex is to other influential vertices. EigenCentrality measures a node’s influence based on the number of links it has to other nodes in the network. EigenCentrality then goes a step further by also taking into account how well connected a node is, and how many links their connections have, and so on through the network.

eigen_centrality(pilotGraph)$vector
     Luke      Anne       Zee       Bob       Tim    Jazmin      Juan    Kalani 
0.6664506 0.9890806 0.8248114 0.4772630 0.9650966 0.8805495 0.7411246 0.8404030 
      Liz     Brent 
1.0000000 0.6828100 

PageRank

PageRank is a variant of EigenCentrality, also assigning nodes a score based on their connections, and their connections’ connections. The difference is that PageRank also takes link direction and weight into account – so links can only pass influence in one direction, and pass different amounts of influence.

page.rank(pilotGraph)$vector
      Luke       Anne        Zee        Bob        Tim     Jazmin       Juan 
0.09260366 0.13238445 0.09814399 0.07592534 0.12618660 0.10777405 0.07780444 
    Kalani        Liz      Brent 
0.10170905 0.12280305 0.06466537 

Global Properties

Network Density

The proportion of all potential edges between vertices that actually exist in the network graph. It is an indicator of how well connected the vertices of the graph are.

edge_density(pilotGraph)
[1] 0.4666667

Average Path Length

The mean of the lengths of the shortest paths between all pairs of vertices in the network.

mean_distance(pilotGraph)
[1] 1.577778

Global Transitivity

Probability that the adjacent verticles of a given vertex are connected (e.g., triangles)

transitivity(pilotGraph)
[1] 0.7880184

Local Transitivity

Metric calculates the proportion of closed triangles that a vertex is part of, given the number of triangles it could be a part of

transitivity(pilotGraph, type="local")
 [1] 0.8095238 0.7222222 0.9333333 0.8000000 0.7857143 0.7857143 0.9333333
 [8] 0.7142857 0.7142857 0.8666667

Cliques

All vertices connected to each other.

largest_cliques(pilotGraph)
[[1]]
+ 6/10 vertices, named, from 397f698:
[1] Luke   Anne   Jazmin Tim    Brent  Liz   

[[2]]
+ 6/10 vertices, named, from 397f698:
[1] Juan   Anne   Jazmin Tim    Zee    Kalani

[[3]]
+ 6/10 vertices, named, from 397f698:
[1] Juan   Anne   Jazmin Tim    Zee    Liz   

Cliques of Any Size

max_cliques(pilotGraph)
[[1]]
+ 4/10 vertices, named, from 397f698:
[1] Bob    Luke   Anne   Kalani

[[2]]
+ 5/10 vertices, named, from 397f698:
[1] Bob   Luke  Anne  Brent Liz  

[[3]]
+ 5/10 vertices, named, from 397f698:
[1] Luke   Anne   Jazmin Tim    Kalani

[[4]]
+ 6/10 vertices, named, from 397f698:
[1] Luke   Anne   Jazmin Tim    Brent  Liz   

[[5]]
+ 6/10 vertices, named, from 397f698:
[1] Juan   Anne   Jazmin Tim    Zee    Kalani

[[6]]
+ 6/10 vertices, named, from 397f698:
[1] Juan   Anne   Jazmin Tim    Zee    Liz   

Special Relationships

Assortativity: Do birds of a feather flock together?

Do similar nodes connect to each other?

You can calculate assortativity based on categories (nominal), numeric values, or degree centrality

assortativity.nominal(UKfaculty, V(UKfaculty)$Group)
[1] 0.7053802

Recriprocity

What is the probability of an edge going both ways?

reciprocity(UKfaculty)
[1] 0.5875153

Community Detection

Where are connections between nodes more dense?

Fast Greedy Algorithm

Only work on undirected graphs

fastgreedy.community(as.undirected(UKfaculty))
IGRAPH clustering fast greedy, groups: 5, mod: 0.56
+ groups:
  $`1`
   [1]  2  8 11 15 18 19 21 25 29 31 34 35 39 41 43 46 57 58 79
  
  $`2`
   [1] 24 32 37 38 48 50 52 54 55 64 70
  
  $`3`
   [1]  1  3  4  9 17 36 44 45 53 59 60 61 62 73 74 75 78 81
  
  $`4`
  + ... omitted several groups/vertices

Betweenness Algorithm

Dividing into smaller pieces until identifies bridges between communities

edge.betweenness.community(UKfaculty)
IGRAPH clustering edge betweenness, groups: 14, mod: 0.12
+ groups:
  $`1`
  [1] 1
  
  $`2`
   [1]  2  3  4  5  7  8  9 10 12 13 16 17 18 19 21 23 25 27 28 29 31 33 35 37
  [25] 38 40 41 42 43 46 48 49 50 51 52 53 54 58 59 60 61 62 65 68 69 70 71 72
  [49] 73 74 75 76 77 78 79 81
  
  $`3`
  [1]  6 22 30 66
  + ... omitted several groups/vertices

Walk Trap Algorithm

x<-walktrap.community(UKfaculty)
length(x)
[1] 6
sizes(x)
Community sizes
 1  2  3  4  5  6 
15  8 25 25  2  6 
membership(x)
 [1] 1 4 1 1 2 3 3 4 5 3 4 3 3 6 4 3 1 4 4 6 4 3 3 2 4 6 3 3 4 3 4 2 3 4 4 1 4 4
[39] 4 3 4 3 4 1 1 4 3 2 3 4 6 2 1 4 2 6 4 4 1 5 1 4 3 2 3 3 2 3 3 4 3 3 1 1 1 3
[77] 3 1 4 6 1
plot(x,UKfaculty)

Plotting

Base Plotting

Layouts

# Plot the graph object pilotGraph in a circle layout
plot(pilotGraph, vertex.label.color = "black", layout = layout_in_circle(pilotGraph))

# Plot the graph object pilotGraph in a Fruchterman-Reingold layout 
plot(pilotGraph, vertex.label.color = "black", layout = layout_with_fr(pilotGraph))

# Plot the graph object pilotGraph in a Tree layout 
m <- layout_as_tree(pilotGraph)
plot(pilotGraph, vertex.label.color = "black", layout = m)

# Plot the graph object pilotGraph using igraph's chosen layout 
m1 <- layout_nicely(pilotGraph)
plot(pilotGraph, vertex.label.color = "black", layout = m1)

Vary Edge Thickness by Edge Weight/Attribute

m1 <- layout_nicely(pilotGraph)
plot(pilotGraph, 
        vertex.label.color = "black", 
        edge.color = 'black',
        edge.width = E(pilotGraph)$friendship,
        layout = m1)

Plotting by Attributes

You can conditionally assign colors based on some attribute and plot based off that attribute. igraph vertices have an attribute for color.

V(pilotGraph)$color <- ifelse(V(pilotGraph)$party == "Dem", "blue", "red")
plot(pilotGraph, vertex.label.color="black")

Plot Cliques

# Assign largest cliques output to object 'lc'
lc <- largest_cliques(pilotGraph)
# Create two new undirected subgraphs, each containing only the vertices of each largest clique.
gs1 <- as.undirected(subgraph(pilotGraph, lc[[1]]))
gs2 <- as.undirected(subgraph(pilotGraph, lc[[2]]))
# Plot the two largest cliques side-by-side
par(mfrow=c(1,2)) # To plot two plots side-by-side
plot(gs1,
     vertex.label.color = "black", 
     vertex.label.cex = 0.9,
     vertex.size = 0,
     edge.color = 'gray28',
     main = "Largest Clique 1",
     layout = layout.circle(gs1)
)
plot(gs2,
     vertex.label.color = "black", 
     vertex.label.cex = 0.9,
     vertex.size = 0,
     edge.color = 'gray28',
     main = "Largest Clique 2",
     layout = layout.circle(gs2)
)

Using a Real Dataset: ‘UK Faculty’

data("UKfaculty")

Can you inspect the UK

Depiction of assortativity

# select colors
colors = brewer.pal(4, "Dark2")
# assign colors to groups
V(UKfaculty)$color = sapply(V(UKfaculty)$Group, function(x) colors[x])

plot(UKfaculty, layout = layout_nicely(UKfaculty, dim = 2),
     vertex.color = V(UKfaculty)$color, vertex.frame.color = NA,
     vertex.label = NA, vertex.shape = 'square',
     vertex.size = 3.5, edge.arrow.size = 0.3, edge.curved = TRUE,
     edge.width = E(UKfaculty)$weight ^ 0.8,
     edge.color = rgb(0, 0, 0, alpha = 0.1))
title("UK Faculty Friendship Network (Directed)", cex.main = 1)